ANCHOR LIBRARIES AND IDENTIFICATION OF
PEPTIDE BINDING SEQUENCES

Field of the Invention 
This invention relates to anchor libraries and to methods of using anchor libraries to 
identify peptide sequences that bind to a target molecule. 

Background of the Invention

The identification of peptides that bind to target molecules involved in various
physiological functions can have significant implications for the diagnosis and/or treatment of
various abnormal or diseased conditions. For example, a binding peptide might modulate the
original activity of the target molecule and therefore be useful as a drug.

The use of standard libraries to identify peptide sequences which specifically bind to target
molecules is generally limited to pre-existing natural sequences from the organism which is the
source of the DNA. More recently, libraries have been described which have clones containing
short synthetic random coding sequences. See, e.g., Scott and Smith, Science 249:386-390
(1990); Cwirla et al., Proc. Natl. Acad. Sci. USA 87:6378-6382 (1990); Devlin et al., Science
249:404-406 (1990). These libraries are mixtures of filamentous phage clones, each displaying a
random peptide sequence on the virion surface. In these types of libraries, the random amino
acids are contiguous. The size of the peptides that can be screened in such contiguous random
amino acid libraries is limited: as the size of the peptides increases, at some point it is no
longer feasible to search the library adequately, because too many clones are required to cover
all possible permutations of the random amino acids in the peptides.

Summary of the Invention 
It is an object of the invention to identify peptide sequences that bind to specific target 
molecules. 

It is another object of the invention to identify amino acid residues in a peptide that are
important contacts between the peptide and a target molecule.

It is another object of the invention to determine where amino acid residues in a peptide
that are important contacts between the peptide and a target molecule are best positioned within
the peptide.

It is another object of the invention to use an anchor library in which the random amino
acid residues of the library are not continuous, for identifying amino acid residues in a peptide
that are important contacts between the peptide and a target molecule.

It is another object of the invention to use an anchor library in which the random amino
acid residues of the library are distributed throughout a much larger peptide domain consisting of
random glycine and/or alanine residues, for identifying amino acid residues in a peptide that are
important contacts between the peptide and a target molecule.

It is another object of the invention to search large peptide phage display libraries of, e.g.,
16 mers, for a reduced number of essential amino acid residue contacts, e.g., four, between the
peptide and a target molecule.

It is another object of the invention to identify a consensus sequence of a defined number 
of amino acid residues in any configuration of spacer amino acids, that are important contacts 
between a peptide and a target molecule. 

It is yet another object of the invention to use a known core binding sequence on a peptide
which binds to a target molecule, and identify surrounding amino acid residues which are
additional important contacts between the peptide and the target molecule.

Still another object of the invention is to identify cysteine residues on a peptide which can
form disulfide bridges and thereby increase the binding affinity of the peptide with a target
molecule.

According to the invention, an anchor library is provided. The anchor library comprises a
collection of recombinant vectors, e.g., viruses, phage, e.g., filamentous phage, plasmids or
cosmids. Each of the vectors has a nucleic acid sequence inserted in a gene, e.g., a coat protein
gene, e.g., gene III or gene VIII, thioredoxin, staphnuclease, lac repressor, gal4 or an antibody.
The nucleic acid sequence encodes a displayed peptide sequence, e.g., displayed on the surface
of a virion, cell, spore or gene product, which comprises:

$$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$$

wherein each $X^1$, $X^2$, $X^3$ and $X^4$ is an amino acid residue and any of $X^1$, $X^2$, $X^3$
and $X^4$ can be the same as or different from any one other, wherein each $Y^1$, $Y^2$ and $Y^3$
is alanine or glycine or a combination of alanine and glycine that is, respectively, $c^1$, $c^2$
and $c^3$ amino acid residues long and any of $Y^1$, $Y^2$ and $Y^3$ if present can be the same as
or different from any one other, wherein each of $c^1$, $c^2$ and $c^3$ preferably is 0 to about
20, more preferably is 0 to about 10, even more preferably is 0 to about 6, or most preferably is
0 to about 4, wherein $X^1$ and $X^4$ are each attached to an amino acid residue that flanks the
displayed peptide sequence. In certain embodiments, at least about 10* to about 10« permutations
of all possible permutations of the displayed peptide sequence are present in the anchor library.
In other embodiments, the library does not contain more than about 10%, or more than about 1%,
or more than about 0.1%, of displayed peptide sequences different from the first mentioned
displayed peptide sequences.

WO 96/41180    PCT/US96/09383

Another aspect of the invention is where each $Y^1$, $Y^2$ and $Y^3$ is any specified amino acid or
combination of specified amino acids, e.g., alanine or cysteine or a combination of alanine and
cysteine; or glycine or cysteine or a combination of glycine and cysteine.

In certain embodiments, the displayed peptide sequence further has at least one core 
binding sequence which is preferably about 1 to about 20 amino acid residues in length, more 
preferably about 4 to about 10, and most preferably is 6. The core binding sequence can be in
addition to, or a replacement for, other amino acids in the displayed peptide sequence.
Variations include the presence of more than one core binding sequence in the displayed peptide 
sequence, where, e.g., the core binding sequences can be adjacent, or not adjacent, to each other, 
and where they can be, e.g., identical or not identical to each other. 

In other embodiments, the displayed peptide sequence further has at least one constraint,
e.g., a crosslink, e.g., a disulfide bond, e.g., from the presence of a cysteine residue; a stacking
interaction; a positive or negative charge; hydrophobicity; hydrophilicity; a structural motif, e.g.,
a zinc finger formation, a leucine zipper, or a β-turn structure, e.g., from the presence of the
sequence asp gly or pro gly; or combinations thereof. Cysteine residues can be in addition to, or
a replacement for, other amino acids in the displayed peptide sequence.

Another aspect of the invention is a method of making an anchor library. A collection of
nucleic acid sequences is synthesized. The nucleic acid sequences are inserted into vectors to
give recombinant vectors and the recombinant vectors are introduced into a host. The host
having the recombinant vectors is propagated so as to result in a collection of recombinant
vectors, each of which has a nucleic acid sequence from the collection of nucleic acid sequences
which encodes a displayed peptide sequence comprising:

$$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$$

Another aspect of the invention is a method of using an anchor library to identify a peptide
sequence that binds to a target. An anchor library having a collection of recombinant vectors is
provided. Each of the recombinant vectors has a nucleic acid sequence which encodes a
displayed peptide sequence comprising:

$$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$$

Expression and display of the peptide sequence is permitted. The anchor library is contacted
with the target, e.g., proteinaceous or non-proteinaceous molecules, e.g., ligands, receptors,
hormones, cytokines, antibodies, antigens, enzymes, enzyme substrates or viruses, under
conditions in which the displayed peptide sequence binds to the target, and the displayed peptide
sequence which binds to the target is identified, e.g., by sequencing the nucleic acid sequence on
the recombinant vector which encodes the displayed peptide sequence. Preferably, the
identified displayed peptide sequence is synthesized.

The invention also provides for a peptide which is identified by use of an anchor library, in 
which the peptide is useful as a diagnostic or therapeutic product in that the peptide is able to 
bind to a target molecule which is involved in a physiological process. 

Other aspects of the invention include, e.g., a collection of recombinant DNA molecules
encoding peptide sequences having a plurality of different binding domains; a recombinant
filamentous phage having a displayed peptide sequence with known binding properties and
which is foreign to the filamentous phage; a recombinant vector having a nucleic acid sequence
inserted in a gene, the nucleic acid sequence encoding a displayed peptide sequence having
known binding properties; a recombinant nucleic acid molecule having a nucleic acid sequence
inserted in a gene, the nucleic acid sequence encoding a displayed peptide sequence having
known binding properties; and a recombinant protein having a displayed peptide sequence
having known binding properties.

The above and other objects, features and advantages of the present invention will be
better understood from the following specification.

Detailed Description

This invention provides an anchor library. The anchor library comprises a collection of
recombinant vectors, each of which has a nucleic acid sequence inserted in a gene. The nucleic
acid sequence encodes a displayed peptide sequence which comprises:

$$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$$

wherein each $X^1$, $X^2$, $X^3$ and $X^4$ is an amino acid residue and any of $X^1$, $X^2$, $X^3$
and $X^4$ can be the same as or different from any one other, wherein each $Y^1$, $Y^2$ and $Y^3$
is alanine or glycine or a combination of alanine and glycine that is, respectively, $c^1$, $c^2$
and $c^3$ amino acid residues long and any of $Y^1$, $Y^2$ and $Y^3$ if present can be the same as
or different from any one other, wherein each of $c^1$, $c^2$ and $c^3$ is 0 to about 20, wherein
$X^1$ and $X^4$ are each attached to an amino acid residue that flanks the displayed peptide
sequence. In certain embodiments at least about 10* to about 10* permutations of all possible
permutations of the displayed peptide sequence are present in the anchor library. In other
embodiments, the library does not contain more than about 10%, or more than about 1%, or more
than about 0.1% of displayed peptide sequences different from the first mentioned displayed
peptide sequences.

By anchor library is meant a library in which the recombinant vectors have nucleic acid
sequences which code for peptide sequences with random amino acids in which the random
amino acids are not continuous. An anchor library is thus distinguishable from other random
amino acid libraries in which all random amino acids in the peptide sequence of interest are
contiguous. In anchor libraries, a given number of random amino acids are distributed
throughout a larger peptide domain consisting of specifically designated amino acid residues.
Anchor libraries are meant to include, e.g., external libraries, e.g., phage display libraries, and
internal libraries, e.g., plasmid libraries. Chemical libraries can be anchor libraries.

Vectors are meant to include, e.g., phage, viruses, plasmids, cosmids, or any other suitable
vector known to those skilled in the art. The vector has a gene, native or foreign, which is able to
tolerate insertion of a foreign peptide into the gene product of the gene. By gene is meant an
intact gene or fragment thereof. In the invention, the expressed gene product contains the
inserted peptide.

For certain embodiments of this invention, e.g., where phage display libraries are
employed, the preferred vectors are filamentous phage, though other vectors can be used.
Filamentous phage are single stranded DNA phage having coat proteins. Preferably, the gene that
the nucleic acid sequence is inserted into is a coat protein gene of the filamentous phage.
Preferred coat proteins are gene III or gene VIII coat proteins. Insertion of a foreign peptide into
a coat protein gene results in the display of the foreign peptide on the surface of the phage.
Insertion into any other gene product in which the inserted peptide is displayed can also be used
in this invention. Examples of filamentous phage vectors which can be used in this invention are
fUSE vectors, e.g., fUSE1, fUSE2, fUSE3 and fUSE5, in which the insertion is just downstream
of the pIII signal peptide. Smith and Scott, Methods in Enzymology 217:228-257 (1993).

In other embodiments, e.g., where internal libraries are employed, the preferred vectors are
plasmids, though other vectors can be used. The gene that the nucleic acid is inserted into is a
gene which also results in display of the inserted peptide sequence. The gene can encode an
exported or non-exported gene product. Preferred genes include, e.g., thioredoxin,
staphnuclease, lac repressor, gal4 or an antibody.

By recombinant vector is meant a vector having a nucleic acid sequence which is not
normally present in the vector. The nucleic acid sequence is inserted into a gene present on the
vector. Insertion of a nucleic acid into a gene is meant to include insertion within the gene or
immediately 5' or 3' to, respectively, the beginning or end of the gene, such that when expressed,
a fusion gene product is made. The nucleic acid sequence that is inserted includes, e.g., a
synthesized nucleic acid sequence or a fragment of another nucleic acid molecule. The nucleic
acid sequence encodes a displayed peptide sequence.

By displayed peptide sequence is meant a peptide sequence that is on the surface of, e.g., a
virion, e.g., a phage or virus, a cell, a spore, or an expressed gene product. It is preferable to have
the displayed peptide displayed such that it is able to bind to added target molecules. A
displayed peptide sequence can be identical to, or not identical to, a naturally occurring peptide
sequence.

The displayed peptide sequence can vary in size. As the size increases, the complexity of
the anchor library increases, such that at some point a complete library is not obtainable.
Complete libraries or incomplete libraries can be used in this invention. In certain embodiments,
the complexity of the anchor library is at least about 10^ to about 10". Preferably, the complexity
is at least about 10'. It is preferred that the total size of the displayed peptide sequence (the
random amino acids plus the spacer amino acids) should not be greater than about 100 amino
acids long, more preferably not greater than about 50 amino acids long, and most preferably not
greater than about 25 amino acids long. A particularly preferred library is made up of displayed
peptides in which the longest of the peptides is 16 amino acids, i.e., a 16 mer library.

In large standard libraries, e.g., of 16 mers or greater, it is ordinarily not possible to search
a library which contains all possible combinations of the 16 random amino acids. A major
advantage of the anchor libraries of this invention is that these large libraries can be searched by
looking for a reduced number of essential amino acid contacts between the peptides and the
target. Preferably, the number of essential amino acid contacts should be sufficient to achieve
micromolar binding. Preferably, the reduced number of essential contacts is about three to about
ten, and most preferably it is about four. See Example 4. Thus, e.g., the number of combinations
of four amino acid residue contacts in a 16 mer library is much less than the total number of
combinations of all 16 amino acids in a 16 mer library, and therefore, this invention makes it
possible to determine four important contact amino acids in a peptide of 16 amino acids in
length, as opposed to standard screening of standard libraries in which such determinations
cannot ordinarily be made.
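
The scale of this reduction can be illustrated with a short calculation. The following sketch is not part of the specification; it assumes that every random position is drawn from the 20 commonly occurring amino acids and simply counts the possible sequences. The function name is hypothetical.

```python
# Illustrative count only; assumes 20 possible residues per random position.
AMINO_ACIDS = 20


def random_sequence_count(random_positions):
    """Number of distinct sequences for a given number of random positions."""
    return AMINO_ACIDS ** random_positions


full_16mer = random_sequence_count(16)    # every one of 16 positions random
four_contacts = random_sequence_count(4)  # only four anchor positions random

print(f"16 random positions: {full_16mer:.3e} sequences")
print(f" 4 random positions: {four_contacts:,} sequences")
print(f"reduction factor:    {full_16mer // four_contacts:.3e}")
```

Under this assumption, four random positions give 160,000 residue combinations, whereas 16 contiguous random positions give 20^16, a factor of 20^12 more.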

In one embodiment of the invention, the displayed peptide sequence comprises

$$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$$

$X^1$, $X^2$, $X^3$ and $X^4$ are amino acid residues, each of which can be the same as or
different from any one of the others. Preferably, the amino acids are chosen from the 20 amino
acids commonly found in naturally occurring proteins.

$Y^1$, $Y^2$ and $Y^3$ can be any specified amino acid residue or combination of specified amino
acid residues, and each of the Ys, if present, can be the same as or different from any one of the
others. Preferably, the amino acids are spacer amino acids which will not significantly interfere
with the binding between the peptide sequence and a target molecule. It is preferable to use
combinations of two or more amino acids for the Y amino acids in a given library so as to reduce
any limitations in the conformations of the displayed peptide that might be imposed by use of
only one given amino acid. Most preferably, glycine and alanine residues are used in
combination in the library. Glycine and alanine are small side chain amino acids that appear to
act more as blanks than interfering contacts. In other embodiments, the Y amino acids can be
amino acids which are chosen because they do significantly affect in some way the binding
between the peptide sequence and a target molecule. For example, glycine and cysteine residues
can be used in combination, or alanine and cysteine residues can be used in combination.

$Y^1$, $Y^2$ and $Y^3$ are, respectively, $c^1$, $c^2$ and $c^3$ amino acid residues long. $c^1$,
$c^2$ and $c^3$ can be the same as or different from any one of the others. Preferably, each of
$c^1$, $c^2$ and $c^3$ is 0 to about 20, more preferably is 0 to about 10, even more preferably is
0 to about 6, and most preferably is 0 to about 4.

For example, in an anchor library where each of the c's are 0 to 4, and the Y's are a
combination of glycine and alanine, the minimal structure of the peptide sequence is 4 amino
acids long (where each of $c^1$, $c^2$ and $c^3$ is 0):

$$X^1X^2X^3X^4$$

and the maximal structure of the peptide sequence is 16 amino acids long (where each of $c^1$,
$c^2$ and $c^3$ is 4):

$$X^1(G/A)(G/A)(G/A)(G/A)X^2(G/A)(G/A)(G/A)(G/A)X^3(G/A)(G/A)(G/A)(G/A)X^4,$$

where (G/A) is a glycine or alanine residue. This anchor library also contains all other
in-between permutations of c, e.g., where $c^1$ is 0, $c^2$ is 1 and $c^3$ is 1; where $c^1$ is 1,
$c^2$ is 1 and $c^3$ is 1; where $c^1$ is 2, $c^2$ is 1 and $c^3$ is 1; etc. All possible
permutations of alanine and glycine for each of the designated c values are also included in this
anchor library.
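
The structure of such a library can be made concrete by enumerating its templates. The sketch below is illustrative only and is not the procedure used in the Examples; it assumes each spacer is 0 to 4 residues of glycine (G) or alanine (A) and writes each random anchor position as the placeholder letter X. All function and variable names are hypothetical.

```python
from itertools import product

SPACER_RESIDUES = "GA"  # glycine/alanine spacers, as in the example above
MAX_SPACER_LEN = 4      # each c ranges over 0..4 in this example


def spacers(max_len=MAX_SPACER_LEN):
    """All G/A spacer strings of length 0..max_len."""
    return ["".join(p)
            for n in range(max_len + 1)
            for p in product(SPACER_RESIDUES, repeat=n)]


def templates(max_len=MAX_SPACER_LEN):
    """Yield every X(Y)X(Y)X(Y)X template; 'X' marks a random position."""
    for y1, y2, y3 in product(spacers(max_len), repeat=3):
        yield f"X{y1}X{y2}X{y3}X"


all_templates = list(templates())
lengths = sorted({len(t) for t in all_templates})

print("spacer variants per Y:", len(spacers()))
print("templates:", len(all_templates))
print("shortest / longest template:", lengths[0], "/", lengths[-1])

# Upper bound on distinct peptides, assuming 20 residues per X position.
# It is only a bound: an X that happens to be G or A can make two
# different templates encode the same peptide.
print("upper bound on peptides:", len(all_templates) * 20 ** 4)
```

Under these assumptions each spacer has 31 variants, giving 31^3 templates that range from 4 to 16 residues in length.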

It is preferred that all possible permutations of the displayed sequence are present, that is,
all combinations of c values and all combinations of, e.g., alanine and/or glycine, for each of the
c values. In other embodiments, at least about 10^ to about 10* permutations of all possible
permutations are present in the anchor library, or at least about 10^ permutations of all possible
permutations are present in the anchor library, or at least about 10^ permutations of all possible
permutations are present in the anchor library, or at least about 10* permutations of all possible
permutations are present in the anchor library, or at least about 10^ permutations of all possible
permutations are present in the anchor library, or at least about 10^ permutations of all possible
permutations are present in the anchor library, or at least about 10^ permutations of all possible
permutations are present in the anchor library.



In certain embodiments, the library does not contain more than about 10% of displayed
peptide sequences different from the first mentioned displayed peptide sequences. In other
embodiments, the library does not contain more than about 1% of displayed peptide sequences
different from the first mentioned displayed peptide sequences. And in yet other embodiments,
the library does not contain more than about 0.1% of displayed peptide sequences different from
the first mentioned displayed peptide sequences.

In certain embodiments of the invention, the displayed peptide can have additional units of
$X(Y)_c$. For example, it can have preferably about 1 to about 10 additional units, more preferably
about 1 to about 5 additional units, and most preferably about 1 to about 3 additional units. In
other embodiments, one or more additional units of $X$ alone or $(Y)_c$ alone can be present.

In yet other embodiments of the invention, the anchor libraries described above can have at
least one core binding sequence, denoted by B, of p amino acid residues in length. B can be any
size, e.g., from a single amino acid to the size of a gene. Preferably, p is about 1 to about 20,
more preferably p is about 4 to about 10, and most preferably p is about 6. By core binding
sequence is meant a peptide sequence which is known to bind to a target molecule. In certain
embodiments, the core binding sequence is additional to the amino acid residues of the displayed
peptide sequences described above. In such libraries, the core binding sequence can be
positioned on the NH2-terminal or COOH-terminal side of any of the $X^1$, $X^2$, $X^3$ or $X^4$
amino acid residues, or on the NH2-terminal or COOH-terminal side of any of the Y, e.g., alanine
or glycine, residues. In other embodiments, at least one of the X residues is replaced with the
core binding sequence. In yet other embodiments, at least one of the Y residues, e.g., one of the
alanine or glycine residues, is replaced with a core binding sequence. Inclusion of a known core
binding sequence in the anchor library allows identification of surrounding amino acid residues
which are additional important contacts between the peptide and the target molecule. The
invention thus allows identification of better binding sequences by identifying additional amino
acids surrounding the core binding sequence which in combination with the known core binding
sequence exhibit enhanced binding as compared to the known core binding sequence alone.

In certain embodiments, more than one known binding sequence is present in each of the
displayed peptide sequences of the anchor library. These multiple known binding sequences can
be adjacent to, or not adjacent to, each other, and can be identical to, or not identical to, each
other.

In certain embodiments, the anchor libraries have at least one constraint imposed upon the
displayed peptide sequence. A constraint includes, e.g., a crosslink, a stacking interaction, a
positive or negative charge, hydrophobicity, hydrophilicity, a structural motif and combinations
thereof. In certain embodiments, more than one constraint is present in each of the displayed
peptide sequences of the anchor library. These multiple constraints can be adjacent to, or not
adjacent to, each other, and can be identical to, or not identical to, each other.

A crosslink includes, e.g., a disulfide bond. In certain embodiments, the displayed peptide
has at least one cysteine residue. The cysteine residue can be, e.g., additional to the amino acid
residues of the displayed peptide sequences described above. In such libraries, the cysteine
residue can be positioned on the NH2-terminal or COOH-terminal side of any of the $X^1$, $X^2$,
$X^3$ or $X^4$ amino acid residues, or on the NH2-terminal or COOH-terminal side of any of the
Y, e.g., alanine or glycine, residues. In other embodiments, at least one of the X residues is a
cysteine residue. In yet other embodiments, at least one of the Y residues, e.g., one of the alanine
or glycine residues, is replaced with a cysteine residue. Multiple cysteines can be present in each
of the peptides so as to form potential disulfide bonds within a random series. Disulfide bonds
can be formed within the displayed peptide sequence itself or between the displayed peptide
sequence and the target molecule.

A structural motif includes, e.g., a zinc finger formation, a leucine zipper, and a β-turn
structure in the peptide. The sequences asp gly or pro gly are likely to induce β-turns, either
alone or in combination with, e.g., a disulfide bond.

In other embodiments, the anchor libraries can be constructed to have both a core binding
sequence and a constraint, e.g., at least one cysteine residue. In one such embodiment, at least
one of the X residues can be, e.g., either a cysteine or a glycine such that the displayed peptide
sequence is:

$$(C/G)(Y^1)_{c^1}(C/G)(Y^2)_{c^2}B(C/G)(Y^3)_{c^3}(C/G)$$

where (C/G) is a cysteine or glycine residue. In such a library, multiple cysteines are present so
as to form potential disulfide bonds within a random series.
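
As a rough illustration of how many cysteine arrangements such a library offers, the sketch below enumerates the cysteine/glycine choices at the four (C/G) positions only. It is a simplified count under stated assumptions: the spacers and the core binding sequence B are ignored, and an intramolecular disulfide bond is taken to require at least two cysteines. The variable names are hypothetical.

```python
from itertools import product

# Each of the four (C/G) positions is either cysteine (C) or glycine (G).
patterns = ["".join(p) for p in product("CG", repeat=4)]

# Assumption: a disulfide bond within the peptide needs two or more cysteines.
bridgeable = [p for p in patterns if p.count("C") >= 2]

print("C/G patterns:", len(patterns))
print("patterns with two or more cysteines:", len(bridgeable))
```

Of the 16 possible patterns, 11 contain two or more cysteines and could therefore, in principle, form a disulfide bond within the displayed peptide.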
In yet other embodiments, the displayed peptide sequence comprises:

$$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$$

wherein each $Y^1$, $Y^2$ and $Y^3$ is alanine or glycine or a core binding sequence B of p amino
acid residues in length or a combination of alanine and glycine or alanine and B or glycine and B.
And in yet other embodiments, the displayed peptide sequence comprises:

wherein each $Z^1$, $Z^2$, $Z^3$ and $Z^4$ is an amino acid residue or a core binding sequence B
of p amino acid residues in length and any of $Z^1$, $Z^2$, $Z^3$ and $Z^4$ can be the same as or
different from any one other, and wherein $Z^1$ and $Z^4$ are each attached to an amino acid
residue that flanks the displayed peptide sequence.

Other embodiments include anchor libraries constructed with other configurations of
combinations between X residues and/or Y residues and/or B sequences and/or cysteine residues
and/or other constraints, as is obvious to those skilled in the art.

The invention also includes a method of making the anchor libraries described above. A
collection of nucleic acid sequences is synthesized and inserted into vectors to give recombinant
vectors. These recombinant vectors are introduced into a host. The host having the recombinant
vectors is propagated so as to result in a collection of recombinant vectors, each of the
recombinant vectors having a nucleic acid sequence from the collection of nucleic acid sequences
which encodes a displayed peptide sequence. The peptide sequence is any of the peptide
sequences discussed above, e.g., $X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$, with or without
at least one core binding sequence, and with or without at least one constraint, e.g., a cysteine
residue. In certain embodiments, at least about 10' to about 10» permutations, or about 10^
permutations, or about 10* permutations, or about 10* permutations, or about 10^ permutations,
or about 10* permutations, or about 10' permutations, of all possible permutations of the
displayed peptide sequence are present in the anchor library. In other embodiments, the library
does not contain more than about 10%, or more than about 1%, or more than about 0.1%, of
displayed peptide sequences different from the first mentioned displayed peptide sequences.

The nucleic acids that encode the anchor library can be obtained by any method which
produces the requisite permuted nucleic acids. For example, a split synthesis procedure can be
used. See Cormack and Struhl, Science 262:244-248 (1993). Examples 1 and 3 describe
the use of split synthesis to make nucleic acid inserts for anchor libraries.

The invention further includes a method of using the anchor libraries described above to
identify a peptide sequence that binds to a target. An anchor library having a collection of
recombinant vectors, each of which has a nucleic acid sequence which encodes a displayed
peptide sequence, is provided. The displayed peptide sequence can be any of the peptide
sequences discussed above, e.g., $X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$, with or without
at least one core binding sequence, and with or without at least one constraint, e.g., a cysteine
residue. Expression and display of the peptide sequence is permitted. The anchor library is
contacted with the target under conditions in which the displayed peptide sequence binds to the
target, and the displayed peptide sequence which binds to the target is identified.

Target is meant to include any molecule with which the displayed peptide sequence will
bind. Targets include, e.g., proteinaceous and non-proteinaceous molecules. Examples of targets
are ligands, receptors, hormones, cytokines, antibodies, antigens, enzymes, enzyme substrates
and viruses. In some cases, the binding peptide modulates the original activity of the target
molecule, and therefore can be useful as a drug. The target includes, e.g., drug antagonists and
agonists. The binding peptides can be used, e.g., for diagnostic or therapeutic applications.

The contacting step can be done by any method in which the displayed peptide sequence
will bind, directly or indirectly, to the target. These methods include, e.g., screens and
selections. Preferably, an affinity purification method is used. Affinity purification includes,
e.g., biopanning. For example, a phage anchor library having displayed peptide sequences is
mixed with biotinylated target, resulting in a phage:biotinylated target complex if a displayed
peptide sequence binds to the target. The mixture is added to a streptavidin coated substance,
e.g., beads or a petri plate. The resulting biotin-streptavidin bond allows isolation of the phage
carrying peptide sequences that bind to the target. It is preferable to do multiple rounds of
biopanning to reduce background. See Example 2.

Identification of the displayed peptide sequence includes, e.g., determining the sequence of
amino acids that comprise the peptide. Identification can be accomplished, e.g., by amplifying
the recombinant vector which has the nucleic acid sequence which encodes the displayed
peptide sequence which binds to the target, and sequencing the nucleic acid sequence by standard
procedures known in the art to determine the displayed peptide sequence which binds to the
target. If desired, the peptide thus identified can be synthesized using standard procedures
known in the art and further tested for its ability to bind to the target in vitro and/or in cell-based
and/or animal models. See Example 2.
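
Once the insert has been sequenced, deducing the displayed peptide is a direct application of the standard genetic code. The following sketch is illustrative only: it assumes the insert is read in frame on the coding strand, and the example insert sequence is hypothetical rather than taken from the Examples.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame coding-strand DNA insert to one-letter residues."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("insert length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical sequenced insert from a binding clone.
insert = "TGGGGTGCTCATGGCCGTGCGAAA"
peptide = translate(insert)
print(peptide)
```

For this hypothetical insert the deduced peptide is WGAHGRAK, which fits the anchor pattern with spacers GA, G and A separating the residues W, H, R and K.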

In a given anchor library, the ability to determine essential amino acid contacts between
the displayed peptide and a target molecule is aided by the ability to observe conserved amino
acid residues in the different displayed peptides which are able to bind to the target. Conserved
amino acid residues are meant to include different DNA codons for the same amino acid or
different DNA codons for functionally similar amino acids. The consensus is determined by
comparing the sequence of the individual clones obtained from a library screen. It is preferable
that the library have sufficient complexity in order to observe such a consensus.
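
A consensus of this kind can be tallied position by position. The sketch below is a simplified illustration, not the analysis used in the Examples: it assumes the four anchor residues of each binding clone have already been extracted and aligned, the clone data are hypothetical, and the 60% threshold is an arbitrary choice. Grouping functionally similar residues is omitted.

```python
from collections import Counter

# Hypothetical anchor residues (X1..X4) recovered from five binding clones.
clones = ["WHRK", "WHKK", "WYRK", "FHRK", "WHRR"]


def consensus(seqs, threshold=0.6):
    """Most common residue at each position, or '-' if below the threshold."""
    result = []
    for column in zip(*seqs):
        residue, count = Counter(column).most_common(1)[0]
        result.append(residue if count / len(seqs) >= threshold else "-")
    return "".join(result)


print(consensus(clones))
print(consensus(clones, threshold=0.9))
```

With these hypothetical clones each position is shared by four of five sequences, so the default threshold reports a residue at every position, while a 90% threshold reports none.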

Also included in the invention is a peptide identified by use of any of the anchor libraries
described above in which the peptide is useful as a diagnostic or therapeutic product in that the
peptide is able to bind to a target molecule which is involved in a physiological process. For
example, the target molecule can be a receptor involved in inflammation, e.g., IL-1, or in prostate
cancer, e.g., GnRH; or the target molecule can be an enzyme, e.g., a protease, e.g., HIV protease.
By binding to these or other target molecules that are involved in various abnormal conditions or
diseases, the binding peptides of this invention modulate the original activity of the target
molecule and are therefore useful as diagnostic or therapeutic products.

The invention also includes a library which has a collection of nucleic acid molecules 
encoding peptides having random amino acids, the improvement comprising a library in which 
the random amino acids are not continuous so that the amino acids in the peptide that are 
important contacts for interaction between the peptide and a target molecule can be identified. 

The invention also includes a library having a collection of nucleic acid molecules 
encoding peptides having random amino acids, the improvement comprising nucleic acid 
molecules encoding alanine or glycine or a combination of alanine and glycine residues in 
varying numbers acting as spacers between the random amino acids so that amino acid residues 
in a peptide that are important contacts for interaction between the peptide and a target molecule 
can be identified. 

The invention further provides a collection of recombinant DNA molecules encoding 
peptide sequences having a plurality of different binding domains. The peptide sequences 
comprise: $X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$, wherein each $X^1$, $X^2$, $X^3$ and $X^4$ is an amino acid residue 
and any of $X^1$, $X^2$, $X^3$ and $X^4$ can be the same or different from any one other, wherein each $Y^1$, 
$Y^2$ and $Y^3$ is alanine or glycine or a combination of alanine and glycine that is respectively $c^1$, $c^2$ 
and $c^3$ amino acid residues long and any of $Y^1$, $Y^2$ and $Y^3$ if present can be the same or different 
from any one other, wherein each of $c^1$, $c^2$ and $c^3$ is 0 to about 20, wherein $X^1$ and $X^4$ are each 
attached to an amino acid residue that flanks the peptide sequence, and wherein at least about 10^ 
to about 10® permutations, or about 10^ permutations, or about 10^ permutations, or about 10* 
permutations, or about 10^ permutations, or about 10® permutations, or about 10' permutations, 
of all possible permutations of the peptide sequence are present in the collection. In other 
embodiments, the collection does not contain more than about 10%, or more than about 1%, or 
more than about 0.1%, of displayed peptide sequences different from the first mentioned 
displayed peptide sequences. In certain embodiments, the peptide sequences are displayed on the 
surface of a biological material, e.g., a virus, phage, cell, spore or gene product. 
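
As an illustration of the general formula above, the following sketch tests whether a peptide (one-letter code) can be read as four residues X separated by three alanine/glycine spacer runs. The names `ANCHOR_PATTERN` and `fits_anchor_formula` are assumptions made for this example, and the upper limit "about 20" is treated as exactly 20 for simplicity.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
MAX_SPACER = 20  # "0 to about 20" is treated here as a hard cap of 20

# X1 (Y1)c1 X2 (Y2)c2 X3 (Y3)c3 X4, where each Y run is Ala and/or Gly.
ANCHOR_PATTERN = re.compile(
    rf"[{AMINO_ACIDS}](?:[AG]{{0,{MAX_SPACER}}}[{AMINO_ACIDS}]){{3}}"
)


def fits_anchor_formula(peptide: str) -> bool:
    """True if the whole peptide matches X(Y)X(Y)X(Y)X with Ala/Gly spacers."""
    return ANCHOR_PATTERN.fullmatch(peptide) is not None


print(fits_anchor_formula("WAGKGGFR"))  # W-(AG)-K-(GG)-F-()-R
print(fits_anchor_formula("WLLKFR"))    # six residues, no Ala/Gly spacers
```

Because an X position may itself be alanine or glycine, a single peptide can fit the formula in more than one way; the regular expression only reports whether at least one reading exists.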

The invention also includes a recombinant filamentous phage having a displayed peptide 
sequence with known binding properties. The displayed peptide sequence is foreign to the 
filamentous phage. The displayed peptide sequence comprises: $X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$, 
wherein each $X^1$, $X^2$, $X^3$ and $X^4$ is an amino acid residue and any of $X^1$, $X^2$, $X^3$ and $X^4$ can be the 
same or different from any one other, wherein each $Y^1$, $Y^2$ and $Y^3$ is alanine or glycine or a 
combination of alanine and glycine that is respectively $c^1$, $c^2$ and $c^3$ amino acid residues long and 
any of $Y^1$, $Y^2$ and $Y^3$ if present can be the same or different from any one other, wherein each of 
$c^1$, $c^2$ and $c^3$ is 0 to about 20, wherein $X^1$ and $X^4$ are each attached to an amino acid residue that 
flanks the displayed peptide sequence, and wherein the displayed peptide sequence is able to 
bind to a target. In certain embodiments, at least one of $Y^1$, $Y^2$ and $Y^3$ is at least about 20 amino 
acid residues long, preferably is at least about 10 amino acid residues long, more preferably is at 
least about 6 amino acid residues long, even more preferably is at least about 4 amino acid 
residues long, more preferably yet is at least about 3 amino acid residues long, more preferably 
yet is at least about 2 amino acid residues long, and most preferably is at least about 1 amino acid 
residue long. 

The invention also includes a recombinant vector having a nucleic acid sequence inserted 
in a gene. The nucleic acid sequence encodes a displayed peptide sequence having known 
binding properties. The displayed peptide sequence comprises: $X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$, 
wherein each $X^1$, $X^2$, $X^3$ and $X^4$ is an amino acid residue and any of $X^1$, $X^2$, $X^3$ and $X^4$ can be the 
same or different from any one other, wherein each $Y^1$, $Y^2$ and $Y^3$ is alanine or glycine or a 
combination of alanine and glycine that is respectively $c^1$, $c^2$ and $c^3$ amino acid residues long and 
any of $Y^1$, $Y^2$ and $Y^3$ if present can be the same or different from any one other, wherein each of 
$c^1$, $c^2$ and $c^3$ is 0 to about 20, wherein $X^1$ and $X^4$ are each attached to an amino acid residue that 
flanks the displayed peptide sequence, and wherein the displayed peptide sequence is able to 
bind to a target. In certain embodiments, at least one of $Y^1$, $Y^2$ and $Y^3$ is at least about 20 amino 
acid residues long, preferably is at least about 10 amino acid residues long, more preferably is at 
least about 6 amino acid residues long, even more preferably is at least about 4 amino acid 
residues long, more preferably yet is at least about 3 amino acid residues long, more preferably 
yet is at least about 2 amino acid residues long, and most preferably is at least about 1 amino acid 
residue long. 

The invention also includes a recombinant nucleic acid molecule having a nucleic acid 
sequence inserted in a gene. The nucleic acid sequence encodes a displayed peptide sequence 
having known binding properties. The displayed peptide sequence comprises: 
$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$, wherein each $X^1$, $X^2$, $X^3$ and $X^4$ is an amino acid residue and any of 
$X^1$, $X^2$, $X^3$ and $X^4$ can be the same or different from any one other, wherein each $Y^1$, $Y^2$ and $Y^3$ is 
alanine or glycine or a combination of alanine and glycine that is respectively $c^1$, $c^2$ and $c^3$ amino 
acid residues long and any of $Y^1$, $Y^2$ and $Y^3$ if present can be the same or different from any one other, 
wherein each of $c^1$, $c^2$ and $c^3$ is 0 to about 20, wherein $X^1$ and $X^4$ are each attached to an amino 
acid residue that flanks the displayed peptide sequence, and wherein the displayed peptide 
sequence is able to bind to a target. In certain embodiments, at least one of $Y^1$, $Y^2$ and $Y^3$ is at 
least about 20 amino acid residues long, preferably is at least about 10 amino acid residues long, 
more preferably is at least about 6 amino acid residues long, more preferably is at least about 4 
amino acid residues long, more preferably yet is at least about 3 amino acid residues long, more 
preferably yet is at least about 2 amino acid residues long, and most preferably is at least about 1 
amino acid residue long. 

The invention further includes a recombinant protein having a displayed peptide sequence 
having known binding properties. The displayed peptide sequence comprises: 
$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$, wherein each $X^1$, $X^2$, $X^3$ and $X^4$ is an amino acid residue and any of 
$X^1$, $X^2$, $X^3$ and $X^4$ can be the same or different from any one other, wherein each $Y^1$, $Y^2$ and $Y^3$ is 
alanine or glycine or a combination of alanine and glycine that is respectively $c^1$, $c^2$ and $c^3$ amino 
acid residues long and any of $Y^1$, $Y^2$ and $Y^3$ if present can be the same or different from any one other, 
wherein each of $c^1$, $c^2$ and $c^3$ is 0 to about 20, wherein $X^1$ and $X^4$ are each attached to an amino 
acid residue that flanks the displayed peptide sequence, and wherein the displayed peptide 
sequence is able to bind to a target. In certain embodiments, at least one of $Y^1$, $Y^2$ and $Y^3$ is at 
least about 20 amino acid residues long, preferably is at least about 10 amino acid residues long, 
more preferably is at least about 6 amino acid residues long, even more preferably is at least 
about 4 amino acid residues long, more preferably yet is at least about 3 amino acid residues 
long, more preferably yet is at least about 2 amino acid residues long, and most preferably is at 
least about 1 amino acid residue long. 

EXAMPLES 

Example 1: Construction of a Phage Anchor Library 

This example illustrates the construction of a phage anchor library having random amino 
acid codons distributed throughout a domain of alanine and/or glycine codons. Standard cloning 
techniques known to those skilled in the art were used. 

(a) Vector Preparation 

30 µg of fUSE5 (Smith and Scott, Methods in Enzymology 217:228-257 (1993)) was 
cleaved with 200 units of endonuclease Sfi I in 500 µl of NEB #2 restriction buffer for 10 hours. 
The reaction was terminated with addition of 15 mM EDTA, followed by phenol and chloroform 
extractions. The DNA was recovered by isopropanol precipitation, resuspended in 500 µl of TE, 
and recovered by EtOH precipitation. 

(b) Insert Preparation 

The anchor insert used in the library was synthesized as a single stranded oligomer using 
split synthesis. See, e.g., Cormack and Struhl, Science 262:244-248 (1993). This process creates 
combinations of sequences which differ from each other. 

Using split synthesis, five templates were synthesized and mixed three times to produce 
the anchor library: 

1) GGGCTGCCGGGNNKNNK (Seq. ID No. 1) 
2) GGGCTGCCGGGNNKGSNNNK (Seq. ID No. 2) 
3) GGGCTGCCGGGNNKGSNGSNNNK (Seq. ID No. 3) 
4) GGGCTGCCGGGNNKGSNGSNGSNNNK (Seq. ID No. 4) 
5) GGGCTGCCGGGNNKGSNGSNGSNGSNNNK (Seq. ID No. 5) 

6) NNK 
7) GSNNNK 
8) GSNGSNNNK 
9) GSNGSNGSNNNK (Seq. ID No. 6) 
10) GSNGSNGSNGSNNNK (Seq. ID No. 7) 

11) NNKGGTGGTGCTGCTG (Seq. ID No. 8) 
12) GSNNNKGGTGGTGCTGCTG (Seq. ID No. 9) 
13) GSNGSNNNKGGTGGTGCTGCTG (Seq. ID No. 10) 
14) GSNGSNGSNNNKGGTGGTGCTGCTG (Seq. ID No. 11) 
15) GSNGSNGSNGSNNNKGGTGGTGCTGCTG (Seq. ID No. 12) 

N = equal mix of G, A, T, C 
S = equal mix of G, C 
K = equal mix of G, T 

DNA was chemically synthesized such that column 1 contained the DNA sequence 
GGGCTGCCGGG (Seq. ID No. 13), followed by DNA encoding a random amino acid, NNK, 
followed by DNA encoding a second random amino acid, NNK. Column 2 encoded the DNA 
sequence GGGCTGCCGGG (Seq. ID No. 13), followed by a random amino acid codon, NNK, 
followed by either a glycine or alanine codon, GSN, and then followed by a random amino acid 
codon, NNK. Columns 3, 4 and 5 encoded the DNA sequence GGGCTGCCGGG (Seq. ID No. 
13), followed by a random amino acid codon, NNK, followed by, respectively, 2, 3 and 4 glycine 
and/or alanine codons, GSN, and then followed by a random amino acid codon, NNK. 
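
The roles of the degenerate codons can be checked by expanding them against the standard genetic code. The sketch below is an illustration only; the helper names `expand` and `encoded` are assumptions for this example. It confirms that GSN yields only glycine and alanine codons, and that NNK covers all twenty amino acids with a single stop codon (TAG).

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Degenerate symbols as defined in the text: N, S and K.
DEGENERATE = {
    "N": "GATC", "S": "GC", "K": "GT",
    "G": "G", "A": "A", "T": "T", "C": "C",
}


def expand(codon):
    """List every concrete codon covered by a degenerate codon."""
    return ["".join(p) for p in product(*(DEGENERATE[b] for b in codon))]


def encoded(codon):
    """Sorted set of amino acids ('*' = stop) encoded by a degenerate codon."""
    return sorted({CODON_TABLE[c] for c in expand(codon)})


print("GSN ->", encoded("GSN"))
print("NNK ->", len(expand("NNK")), "codons,", encoded("NNK"))
```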

After synthesis of columns 1-5, the resins from the five columns were mixed, resulting in a 
pool of oligomers which contained two random amino acids separated by 0 to 4 glycine and/or 
alanine residues. This entire mixture was then split into 5 new columns, denoted 6-10. Each of 
these columns was subjected to further DNA synthesis, resulting in, respectively, codons for 0, 1, 
2, 3 and 4 glycine and/or alanine residues, GSN, followed by a random amino acid, NNK. 
Because the additions of columns 6-10 were conducted on a mixture of resins from columns 1-5, 
the mixture of columns 6-10 resulted in oligomers that all have three random amino acids, such 
that the neighboring random amino acids are separated by 0 to 4 glycine and/or alanine residues. 

One additional round of split synthesis was undertaken in which the mixtures of columns 
6-10 were extended with 0 to 4 glycine and/or alanine residues, GSN, and one more additional 
random amino acid, NNK, followed by the sequence GGTGGTGCTGCTG (Seq. ID No. 14). 
The final mixture of these columns resulted in a series of oligomers with four random amino 
acids such that the neighboring random amino acids are separated by 0 to 4 glycine and/or 
alanine residues. 
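
The three rounds of split synthesis can be summarized by enumerating the spacer lengths chosen in each round. The following sketch is an illustration under stated assumptions: the function names are invented for this example, and the peptide count is a template-by-template tally that ignores stop codons and counts a peptide more than once if it fits several templates, so it should be read as an upper bound rather than a measured library size.

```python
from itertools import product

FIVE_PRIME = "GGGCTGCCGGG"     # Seq. ID No. 13
THREE_PRIME = "GGTGGTGCTGCTG"  # Seq. ID No. 14
SPACER_RANGE = range(5)        # 0 to 4 GSN codons between random codons


def template(gaps):
    """Build one degenerate oligomer template for a tuple of three gaps."""
    parts = [FIVE_PRIME, "NNK"]
    for gap in gaps:
        parts.append("GSN" * gap)
        parts.append("NNK")
    parts.append(THREE_PRIME)
    return "".join(parts)


def peptide_count(gaps):
    """Peptides per template: 20 choices per NNK, 2 (Gly/Ala) per GSN."""
    count = 20 ** 4
    for gap in gaps:
        count *= 2 ** gap
    return count


all_gaps = list(product(SPACER_RANGE, repeat=3))
templates = [template(g) for g in all_gaps]
total_peptides = sum(peptide_count(g) for g in all_gaps)

print(len(templates), "templates")
print("shortest:", min(templates, key=len))
print("longest: ", max(templates, key=len))
print("peptides summed over templates:", total_peptides)
```

Five spacer choices in each of three rounds give 125 templates, which is why a single mixed synthesis covers every combination of 0 to 4 spacer residues between neighboring random positions.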

Two additional oligomers, pins, CCCGGCAGCCCCGT (Seq. ID No. 15) and 
CAGCACCACC (Seq. ID No. 16), were synthesized which hybridize to the anchor oligomers so 
as to reconstruct double stranded DNA near the termini of the insert with three single strand 
nucleotide overhangs corresponding to Sfi I overhangs. 
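
The pairing of the pins with the insert termini can be checked directly from the sequences given in the text. The sketch below is illustrative only; the variable names are assumptions for this example. It shows that each pin is complementary to one flank of the insert and that, in each case, three nucleotides are left single stranded.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


FIVE_PRIME_FLANK = "GGGCTGCCGGG"     # Seq. ID No. 13, start of each insert
THREE_PRIME_FLANK = "GGTGGTGCTGCTG"  # Seq. ID No. 14, end of each insert
PIN_A = "CCCGGCAGCCCCGT"             # Seq. ID No. 15
PIN_B = "CAGCACCACC"                 # Seq. ID No. 16

# Pin A covers the whole 5' flank; its last three bases stay unpaired.
paired_a = revcomp(FIVE_PRIME_FLANK)
overhang_a = PIN_A[len(paired_a):]

# Pin B covers the start of the 3' flank; the insert's last three bases
# stay unpaired.
paired_b = revcomp(PIN_B)
overhang_b = THREE_PRIME_FLANK[len(paired_b):]

print("pin A pairs with 5' flank:", PIN_A.startswith(paired_a), "overhang:", overhang_a)
print("pin B pairs with 3' flank:", THREE_PRIME_FLANK.startswith(paired_b), "overhang:", overhang_b)
```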

The insert and pin oligomers were kinased at 10 µg/30 µl kinase buffer from NEB with 1 
mM ATP at 37°C for 30 minutes, followed by inactivation at 68°C for 5 minutes. The anchor 
oligomer was annealed to the pin oligomers in 500 mM NaCl, 50 mM Tris pH 7.5 at 68°C for 10 
minutes and cooled to room temperature over 30 minutes. Each of the oligomers was at 5 µM 
during the annealing. 

It is noted that similar results can be obtained with other 5' and 3' flanking sequences on 
the anchor inserts, and with other corresponding pin sequences altered appropriately, as can be 
chosen by those skilled in the art. Moreover, other restriction sites can be used as appropriate for 
any given vector, as is known to those skilled in the art. 

(c) Vector Ligation 

30 µg of DNA vector was ligated to assembled insert at 5 µg/ml vector and three-fold 
excess assembled insert in NEB ligation buffer with 100 units of T4 DNA ligase at 10°C for 16 
hours. DNA was purified from ligation buffer by phenol and chloroform extractions, followed 
by EtOH precipitation and resuspension in TE. 

(d) DNA Transformation 

DNA was transformed into MC1061 (Wertman et al., Gene 49:253-262 (1986)) 
electrocompetent cells using 0.5 µg of DNA per 100 µl of cells, 0.2 cm electroporation cuvettes 
and a BioRad electroporator set at 25 µF, 2.5 kV and 200 ohms. Shocked cells were recovered 
in SOC media, grown out at 37°C for 20 minutes and inoculated into LB containing 20 µg/ml 
tetracycline. 

(e) Library Phage Isolation 

Phage released from transformed cells were isolated after growing for 16 hours. Phage 
were separated from cells by centrifugation at 4°C at 4.2K for 30 min in a Beckman J6, 
followed by a second centrifugation of the supernatant at 4.2K for 30 min. Phage were 
precipitated with the addition of 150 ml of 16.7% PEG/3.3 M NaCl per liter of supernatant. 
Mixed solutions were incubated at 4°C for 16 hours. Precipitated phage were collected at 4.2K in 
a J6 followed by resuspension in 40 ml of TBS. Resuspended phage were precipitated again 
with the addition of 4.5 ml of PEG solution for 4 hours. Phage were collected at 5K in a 
Beckman JA20 at 4°C. Phage were suspended in 7 ml of TBS and brought to 1.3 g/ml density 
by the addition of 1 gm of CsCl per 2.226 gm of aqueous solution. Phage were subjected to 
equilibrium centrifugation in a type 80 rotor at 45K rpm for 40 hours. Phage bands were 
isolated, diluted 20 fold with TBS and pelleted at 40K in a Type 50 rotor. Pellets were 
resuspended in 0.7 ml of TBS and used as is for biopanning at approximately 3 x 10*^ phage/ml. 
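
The reagent proportions in this step can be converted into final concentrations with simple dilution arithmetic. The sketch below is a rough illustration that assumes volumes are additive on mixing; the function name `final_concentration` is invented for this example.

```python
def final_concentration(stock, stock_volume, sample_volume):
    """Concentration after adding stock_volume of stock to sample_volume.

    Assumes the two volumes simply add, which is an approximation.
    """
    return stock * stock_volume / (stock_volume + sample_volume)


# 150 ml of 16.7% PEG / 3.3 M NaCl added per liter (1000 ml) of supernatant.
peg_percent = final_concentration(16.7, 150.0, 1000.0)
nacl_molar = final_concentration(3.3, 150.0, 1000.0)

# 1 gm of CsCl per 2.226 gm of aqueous solution, as a mass fraction.
cscl_mass_fraction = 1.0 / (1.0 + 2.226)

print(f"PEG after mixing:  {peg_percent:.2f} %")
print(f"NaCl after mixing: {nacl_molar:.2f} M")
print(f"CsCl mass fraction: {cscl_mass_fraction:.3f}")
```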

Example 2: Biopanning to Select for Peptide Binding Sequences 

This example illustrates biopanning of the phage library obtained from Example 1 to select 
for displayed peptide sequences that bind to biotinylated IL-1β. The phage act as 
affinity-selectable vectors in that the displayed peptide binds specifically to immobilized IL-1β if the 
library contains a displayed peptide that can so interact with IL-1β. 

(a) Binding 

Biotinylated IL-1 (b-IL-1) (Yew et al., JBC 264(30):17691-17697 (1989)) is incubated 
with 1x10" phage in 20 µl of TBS for 20 minutes at 22°C. The phage:(b-IL-1) complex is 
isolated from free phage by addition of streptavidin coated paramagnetic beads for an additional 
10 minutes. Magnetic beads are collected by attraction with a magnet and washed with TBS 
containing 0.5% Tween-20 for a total of 7 washes over 30 minutes. The remaining phage that 
are bound to the beads (by way of b-IL-1 binding to streptavidin) are recovered by elution with 
100 µl of 100 mM glycine pH 2.2 for 10 minutes. Eluted phage are neutralized with 1 M Tris 
base. 

(b) Amplification 

Eluted phage are amplified by infection into log phase K91 E. coli (Lyons and Zinder, 
Virology 49:45-60 (1972); Smith and Scott, Methods in Enzymology 217:228-257 (1993)) at an 
moi of 0.0001. Approximately 10^ phage are amplified by plating on 10 LB agar petri dishes 
containing 20 µg/ml tetracycline. The phage released from infected cells, approximately 10*^ 
phage, are harvested by washing the LB agar plates with LB, and purified as above through the 
two PEG precipitations and resuspended at 10" phage/ml. 

Amplified phage are further subjected to two additional rounds of biopanning using the 
binding and amplification conditions described above. 

(c) Sequencing Inserts 

After three rounds of biopanning, individual phage are isolated and sequenced to reveal the 
DNA sequence that encodes the displayed peptide in the selected phage. Sequencing is done 
according to manufacturer's protocol for Sequenase 2.0 (United States Biochemical, Cleveland, 
OH 44122). 

(d) Peptide Synthesis 

Peptides representing affinity purified phage are synthesized (Research Genetics, 
Huntsville, AL 35801) and tested for their ability to bind IL-1 and affect IL-1 binding to the IL-1 
receptor in cell based and animal models. Slack et al., Biotechniques 10:1132-1138 (1989). 

Example 3: Construction of a Phage Anchor Library Having Codons For a Known Core 
Peptide Binding Sequence 

This example illustrates construction of a phage anchor library which has codons for a 
known core peptide binding sequence which binds to a target molecule, surrounded by random 
amino acid codons distributed throughout a domain of random alanine and/or glycine codons. 
Construction of this type of library is similar to that described in Example 1, except that the 
oligomer constructs not only have the random amino acid codons and glycine and/or alanine 
codons, but also have nucleic acid sequences which code for a known core peptide binding 
sequence, denoted as B: 

1) GGGCTGCCGGGNNKNNK (Seq. ID No. 1) 
2) GGGCTGCCGGGNNKGSNNNK (Seq. ID No. 2) 
3) GGGCTGCCGGGNNKGSNGSNNNK (Seq. ID No. 3) 
4) GGGCTGCCGGGNNKGSNGSNGSNNNK (Seq. ID No. 4) 
5) GGGCTGCCGGGNNKGSNGSNGSNGSNNNK (Seq. ID No. 5) 

6) BNNK 
7) BGSNNNK 
8) BGSNGSNNNK 
9) BGSNGSNGSNNNK (Seq. ID No. 6) 
10) BGSNGSNGSNGSNNNK (Seq. ID No. 7) 

11) NNKGGTGGTGCTGCTG (Seq. ID No. 8) 
12) GSNNNKGGTGGTGCTGCTG (Seq. ID No. 9) 
13) GSNGSNNNKGGTGGTGCTGCTG (Seq. ID No. 10) 
14) GSNGSNGSNNNKGGTGGTGCTGCTG (Seq. ID No. 11) 
15) GSNGSNGSNGSNNNKGGTGGTGCTGCTG (Seq. ID No. 12) 

The anchor library can also be constructed such that sequence B is located, e.g., before or 
after any of the other NNK or GSN codons. 

Other anchor libraries, containing additions or substitutions of nucleic acid sequences, can 
be constructed using similar methods. For example, codons for cysteine, or any other specified 
amino acid or sequence of amino acids, can be substituted for the nucleic acid sequence coding 
for the core binding sequence B in the above-described split synthesis. Anchor libraries 
containing two or more core binding sequences, cysteines, or any other specified amino acid or 
sequence of amino acids, also can be constructed using similar procedures as described, except 
that the multiple additions are synthesized as part of the oligomers at multiple positions, e.g., 
each can be located before or after any of the NNK or GSN codons, as can be chosen by one 
skilled in the art. 
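
The placement of a fixed element such as B before or after any NNK or GSN codon can be illustrated by listing the possible insertion points in one template. In this sketch the core sequence (codons for Thr-Pro-Trp) is a hypothetical stand-in chosen only for the example, and the function name `place_core` is likewise an assumption.

```python
def place_core(elements, core, position):
    """Return the template with `core` inserted before elements[position].

    position == len(elements) places the core after the last element.
    """
    if not 0 <= position <= len(elements):
        raise ValueError("position out of range")
    return "".join(elements[:position]) + core + "".join(elements[position:])


# One anchor template written as codon-sized elements.
elements = ["NNK", "GSN", "GSN", "NNK"]

# Hypothetical core B: codons for Thr-Pro-Trp, for illustration only.
CORE_B = "ACTCCGTGG"

variants = [place_core(elements, CORE_B, i) for i in range(len(elements) + 1)]
for variant in variants:
    print(variant)
```

The same helper could be called with a cysteine codon (for example TGT) in place of the core to sketch the cysteine-containing libraries mentioned above.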

Example 4: Four Amino Acid Residues in a Peptide Are Sufficient For Binding to a Target 

This example illustrates that four amino acid residues in a peptide are sufficient for 
micromolar binding between the peptide and its target. 

A hexamer phage library was constructed essentially as described for the anchor libraries, 
except the oligonucleotide was: 
GGGCTGCCGGGNNKNNKNNKNNKNNKNNKGGTGGTGCTGCTG (Seq. ID No. 18). The 
library was screened against an antibody to hCG by biopanning as described in Example 2. The 
phage that bound to the antibody contained the consensus sequence XaaThrProTrpXaaGln (Seq. 
ID No. 17), where Xaa was not absolutely specified. Peptides were synthesized which 
corresponded to the identified sequences and the flanking amino acids found in the phage. These 
peptides had an IC50 of 4.5 µM compared to 10 nM for hCG. IC50 is equal to the concentration 
of peptide necessary to prevent 50% of ¹²⁵I-labeled hCG from binding to the antibody. Therefore, four 
amino acid residues were sufficient to result in µM binding. 
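
The numbers behind this example can be laid out in a short calculation. The sketch below is illustrative only: the regular expression encodes the consensus with any residue at the two Xaa positions, and the variable and function names are assumptions made for this example.

```python
import re

# Consensus Xaa-Thr-Pro-Trp-Xaa-Gln in one-letter code; Xaa is any residue.
CONSENSUS = re.compile(r"[ACDEFGHIKLMNPQRSTVWY]TPW[ACDEFGHIKLMNPQRSTVWY]Q")


def matches_consensus(hexamer: str) -> bool:
    return CONSENSUS.fullmatch(hexamer) is not None


peptide_space = 20 ** 6      # distinct hexapeptides
codon_space = 32 ** 6        # distinct (NNK)6 coding sequences
matching_peptides = 20 ** 2  # hexapeptides with four fixed positions

# Ratio of the two IC50 values quoted in the text (4.5 uM versus 10 nM).
fold_weaker = 4.5e-6 / 10e-9

print("hexapeptides:", peptide_space)
print("NNK coding sequences:", codon_space)
print("hexapeptides matching the consensus:", matching_peptides)
print("peptide IC50 relative to hCG: about", round(fold_weaker), "fold higher")
```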

Those skilled in the art will be able to ascertain, using no more than routine 
experimentation, many equivalents of the specific embodiments of the invention described 
herein. These and all other equivalents are intended to be encompassed by the following claims. 

SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: 
(A) NAME: PHARMACEUTICAL PEPTIDES, INC. 
(B) STREET: ONE HAMPSHIRE STREET 
(C) CITY: CAMBRIDGE 
(D) STATE: MASSACHUSETTS 
(E) COUNTRY: UNITED STATES OF AMERICA 
(F) ZIP: 02139-1572 

(ii) TITLE OF INVENTION: ANCHOR LIBRARIES AND IDENTIFICATION OF PEPTIDE BINDING SEQUENCES 

(iii) NUMBER OF SEQUENCES: 18 

(iv) CORRESPONDENCE ADDRESS: 
(A) ADDRESSEE: Wolf, Greenfield & Sacks, P.C. 
(B) STREET: 600 Atlantic Avenue 
(C) CITY: Boston 
(D) STATE: Massachusetts 
(E) COUNTRY: USA 
(F) ZIP: 02210 

(v) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Floppy disk 
(B) COMPUTER: IBM PC compatible 
(C) OPERATING SYSTEM: PC-DOS/MS-DOS 
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 
(A) APPLICATION NUMBER: 
(B) FILING DATE: 
(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 
(A) APPLICATION NUMBER: US 08/479,660 
(B) FILING DATE: 07-JUN-1995 

(viii) ATTORNEY/AGENT INFORMATION: 
(A) NAME: Greer, Helen 
(B) REGISTRATION NUMBER: 36,816 
(C) REFERENCE/DOCKET NUMBER: P0567/7000WO 

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: (617) 720-3500 
(B) TELEFAX: (617) 720-2441 



WO 96/41180 PCT/US96/09383 

(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

GGGCTGCCGG GNNKNNK 17 

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

GGGCTGCCGG GNNKGSNNNK 20 

(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

GGGCTGCCGG GNNKGSNGSN NNK 23 

(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 26 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

GGGCTGCCGG GNNKGSNGSN GSNNNK 26 

(2) INFORMATION FOR SEQ ID NO:5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 29 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 

GGGCTGCCGG GNNKGSNGSN GSNGSNNNK 29 

(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 

GSNGSNGSNN NK 12 

(2) INFORMATION FOR SEQ ID NO:7: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 15 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

GSNGSNGSNG SNNNK 15 

(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 16 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

NNKGGTGGTG CTGCTG 16 

(2) INFORMATION FOR SEQ ID NO:9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

GSNNNKGGTG GTGCTGCTG 19 



(2) INFORMATION FOR SEQ ID NO:10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 

GSNGSNNNKG GTGGTGCTGC TG 22 

(2) INFORMATION FOR SEQ ID NO:11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 25 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: 

GSNGSNGSNN NKGGTGGTGC TGCTG 25 

(2) INFORMATION FOR SEQ ID NO:12: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 28 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: 

GSNGSNGSNG SNNNKGGTGG TGCTGCTG 28 



(2) INFORMATION FOR SEQ ID NO:13: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: 

GGGCTGCCGG G 11 

(2) INFORMATION FOR SEQ ID NO:14: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 13 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 

GGTGGTGCTG CTG 13 

(2) INFORMATION FOR SEQ ID NO:15: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 14 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: 

CCCGGCAGCC CCGT 14 

(2) INFORMATION FOR SEQ ID NO:16: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 

CAGCACCACC 10 

(2) INFORMATION FOR SEQ ID NO:17: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 6 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: 

Xaa Thr Pro Trp Xaa Gln 
1               5 

(2) INFORMATION FOR SEQ ID NO:18: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 42 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 

GGGCTGCCGG GNNKNNKNNK NNKNNKNNKG GTGGTGCTGC TG 42 

CLAIMS 

1. An anchor library, comprising: 

a collection of recombinant vectors, 

each of said recombinant vectors having a nucleic acid sequence inserted in a 
gene, said nucleic acid sequence encoding a displayed peptide sequence, 

said displayed peptide sequence of each of said vectors comprising 

$X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4$ 

wherein each $X^1$, $X^2$, $X^3$ and $X^4$ is an amino acid residue and any of $X^1$, $X^2$, $X^3$ and $X^4$ can be the 
same or different from any one other, wherein each $Y^1$, $Y^2$ and $Y^3$ is alanine or glycine or a 
combination of alanine and glycine that is respectively $c^1$, $c^2$ and $c^3$ amino acid residues long and 
any of $Y^1$, $Y^2$ and $Y^3$ if present can be the same or different from any one other, wherein each of 
$c^1$, $c^2$ and $c^3$ is 0 to about 20, wherein $X^1$ and $X^4$ are each attached to an amino acid residue that 
flanks said displayed peptide sequence, and 

wherein at least about 10^ to about 10* permutations of all possible permutations of said 
displayed peptide sequence are present in said anchor library. 

2. The library of claim 1 wherein said library does not contain more than about 10% of 
displayed peptide sequences different from said first mentioned displayed peptide sequences. 

3. The library of claim 1 wherein said library does not contain more than about 1% of 
displayed peptide sequences different from said first mentioned displayed peptide sequences. 

4. The library of claim 1 wherein said library does not contain more than about 0.1% of 
displayed peptide sequences different from said first mentioned displayed peptide sequences. 

5. The library of claim 1 wherein at least about 10^ permutations of all possible 
permutations of said displayed peptide sequence are present in said anchor library. 

6. The library of claim 1 wherein at least about 10* permutations of all possible 
permutations of said displayed peptide sequence are present in said anchor library. 

7. The library of claim 1 wherein at least about 10^ permutations of all possible 
permutations of said displayed peptide sequence are present in said anchor library. 

8. The library of claim 1 wherein at least about 10^ permutations of all possible 
permutations of said displayed peptide sequence are present in said anchor library. 

9. The library of claim 1 wherein said vector is selected from the group consisting of a 
virus, phage, plasmid and cosmid. 

10. The library of claim 1 wherein said vector is a filamentous phage. 

11. The library of claim 10 wherein said gene that said nucleic acid sequence is inserted in is 
a coat protein gene of said filamentous phage. 

12. The library of claim 10 wherein said gene that said nucleic acid sequence is inserted in is 
a filamentous phage gene selected from the group consisting of gene III and gene VIII. 

13. The library of claim 10 wherein said gene that said nucleic acid sequence is inserted in is 
selected from the group consisting of thioredoxin, staphnuclease, lac repressor, gal4 and an 
antibody. 

14. The library of claim 1 wherein said displayed peptide sequence is displayed on the 
surface of a virion. 

15. The library of claim 1 wherein said displayed peptide sequence is displayed on the 
surface of a cell. 

16. The library of claim 1 wherein said displayed peptide sequence is displayed on the 
surface of an expressed gene product. 

17. The library of claim 1 wherein each of said $c^1$, $c^2$ and $c^3$ is 0 to about 10. 

18. The library of claim 1 wherein each of said $c^1$, $c^2$ and $c^3$ is 0 to about 6. 

19. The library of claim 1 wherein each of said $c^1$, $c^2$ and $c^3$ is 0 to about 4. 

20. The library of claim 1 further comprising about 1 to about 10 additional units of $X(Y)_c$. 

21. The library of claim 1 wherein said displayed peptide sequence is not identical to a 
naturally occurring peptide sequence. 

22. The library of claim 1 wherein said displayed peptide sequence is identical to a naturally 
occurring peptide sequence. 

10 

23. The library of claim 1 wherein said displayed peptide sequence further comprises at least 
one B, said B being a core binding sequence of p amino acid residues in length. 

24. The library of claim 23 wherein p is about 1 to about 20. 

15 

25 . The library of claim 23 wherein p is about 4 to about 10. 

26. The library of claim 23 wherein p is about 6. 

20 27. The library of claim 23 wherein said B is selected from the group consisting of said B 
being on the NHj-terminal side of any of said X\ X\ or amino acid residues, said B being 
on the COOH-terminal side of any of said X', X-, X^ or X^ amino acid residues, said B being on 
the NH2-terminal side of any of said alanine or glycine residues, and said B being on the 
COOH-termmal side of any of said alanine or glycine residues. 

28. The library of claim 23 wherein more than one said B is present. 

29. The library of claim 28 wherein said Bs are adjacent to each other. 

30. The library of claim 28 wherein said Bs are not adjacent to each other.



31. The library of claim 28 wherein said Bs are identical to each other.

32. The library of claim 28 wherein said Bs are not identical to each other.



33. The library of claim 1 wherein said displayed peptide sequence further comprises at least 
one constraint selected from the group consisting of a crosslink, a stacking interaction, a positive
or negative charge, hydrophobicity, hydrophilicity, a structural motif and combinations thereof.

34. The library of claim 33 wherein said crosslink is a disulfide bond. 

35. The library of claim 1 wherein said displayed peptide sequence further comprises at least 
one cysteine residue.

36. The library of claim 35 wherein said cysteine residue is selected from the group 
consisting of said cysteine residue being on the NH2-terminal side of any of said X^1, X^2, X^3 or X^4
amino acid residues, said cysteine residue being on the COOH-terminal side of any of said X^1,
X^2, X^3 or X^4 amino acid residues, said cysteine residue being on the NH2-terminal side of any of
said alanine or glycine residues, and said cysteine residue being on the COOH-terminal side of
any of said alanine or glycine residues.

37. The library of claim 1 wherein at least one of said X^1, X^2, X^3 or X^4 residues is a cysteine
residue.

38. The library of claim 1 further comprising at least one B, said B being a core binding 
sequence of p amino acid residues in length, and at least one cysteine residue. 

39. The library of claim 38 wherein said displayed peptide sequence comprises:

(C/G)(Y^1)_{c^1}(C/G)(Y^2)_{c^2}B(C/G)(Y^3)_{c^3}(C/G)

wherein (C/G) is a cysteine or glycine residue.

40. The library of claim 1 wherein the complexity of said library is at least about 1 0'.

41. An anchor library, comprising:

a collection of recombinant vectors, 

each of said recombinant vectors having a nucleic acid sequence inserted in a 
gene, said nucleic acid sequence encoding a displayed peptide sequence, 

said displayed peptide sequence of each of said vectors comprising

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the
same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a
combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and
any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of
c^1, c^2 and c^3 is 0 to about 20, wherein X^1 and X^4 are each attached to an amino acid residue that
flanks said displayed peptide sequence, and

wherein said library does not contain more than about 10% of displayed peptide 
sequences different from said first mentioned displayed peptide sequences. 
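
As an illustration only, the displayed peptide pattern recited above (four variable residues separated by alanine/glycine spacers of 0 to about 20 residues) might be sketched in Python as follows. The function name, the `max_spacer` parameter, and the use of uniform random sampling are assumptions made for this sketch; they are not taken from the specification, which builds the library from synthesized nucleic acid sequences rather than by sampling peptides directly.

```python
import random

# The 20 standard amino acids in one-letter code (assumed alphabet for X).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Spacer residues Y: alanine or glycine, or a combination of the two.
SPACER_RESIDUES = "AG"


def random_anchor_peptide(rng, max_spacer=20):
    """Return one peptide of the form X1 (Y1)c1 X2 (Y2)c2 X3 (Y3)c3 X4.

    Hypothetical helper: each X is drawn from AMINO_ACIDS and each spacer
    length c is drawn from 0..max_spacer inclusive.
    """
    parts = []
    for position in range(4):
        parts.append(rng.choice(AMINO_ACIDS))  # variable residue X
        if position < 3:  # a spacer follows X1, X2 and X3 but not X4
            length = rng.randint(0, max_spacer)
            spacer = "".join(rng.choice(SPACER_RESIDUES) for _ in range(length))
            parts.append(spacer)
    return "".join(parts)


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        print(random_anchor_peptide(rng, max_spacer=4))
```

With `max_spacer=0` every spacer is empty and the four variable residues are contiguous, which corresponds to the lower bound of the recited range.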

42. A method of making said anchor library of claim 1, comprising: 
synthesizing a collection of nucleic acid sequences; 

inserting said nucleic acid sequences into vectors to give recombinant vectors; 
introducing said recombinant vectors into a host; 

propagating said host having said recombinant vectors so as to result in a collection of 
recombinant vectors, each of said recombinant vectors having a nucleic acid sequence from said 
collection of nucleic acid sequences which encodes a displayed peptide sequence; 

said displayed peptide sequence comprising: 

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the
same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a
combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and
any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of
c^1, c^2 and c^3 is 0 to about 20, wherein X^1 and X^4 are each attached to an amino acid residue that
flanks said displayed peptide sequence, and

wherein at least about 10^ to about 10^ permutations of all possible permutations of said 
displayed peptide sequence are present in said anchor library.
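
The reference to "all possible permutations" can be made concrete with a short counting sketch. It assumes an alphabet of the 20 standard amino acids for each variable residue and, as a simplification not stated in the claims, that each spacer has a single fixed composition (for example, all alanine), so that only spacer lengths vary. The function names are hypothetical.

```python
def anchor_permutations(n_variable=4, alphabet=20):
    """Number of distinct assignments of the variable residues X."""
    return alphabet ** n_variable


def spacer_arrangements(max_spacer, n_spacers=3):
    """Number of ways to choose each spacer length from 0..max_spacer."""
    return (max_spacer + 1) ** n_spacers


def contiguous_permutations(length, alphabet=20):
    """Size of a contiguous fully random library of the given length."""
    return alphabet ** length


if __name__ == "__main__":
    per_arrangement = anchor_permutations()   # 20**4 = 160000
    arrangements = spacer_arrangements(20)    # 21**3 = 9261
    print(per_arrangement, arrangements, per_arrangement * arrangements)
    print(contiguous_permutations(10))        # 20**10 = 10240000000000
```

Under these assumptions, four variable residues give 160,000 permutations per spacer arrangement and about 1.5 x 10^9 over all spacer lengths up to 20, whereas a contiguous random 10-mer alone has about 1.0 x 10^13 permutations.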

43. The method of claim 42 wherein said library does not contain more than about 10% of
displayed peptide sequences different from said first mentioned displayed peptide sequences. 

44. A method of using said anchor library of claim 1 to identify a peptide sequence that binds
to a target, comprising: 

providing said anchor library of claim 1, said anchor library having a collection of 
recombinant vectors, each of said recombinant vectors having a nucleic acid sequence which 
encodes a displayed peptide sequence comprising

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4;

permitting the expression and display of said peptide sequence;

contacting said anchor library with said target under conditions in which said displayed
peptide sequence binds to said target; and 

identifying said displayed peptide sequence which binds to said target. 

45. The method of claim 44 wherein said contacting step is by affinity purification.

46. The method of claim 44 wherein said identifying step comprises amplifying said
recombinant vector which has said nucleic acid sequence which encodes for said displayed
peptide sequence which binds to said target, and sequencing said nucleic acid sequence to
determine said displayed peptide sequence which binds to said target.
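
Once a binding peptide has been determined by sequencing, its variable residues and spacers can be separated by matching it against the anchor pattern. The following is a minimal sketch, assuming the peptide is already available as a one-letter amino acid string; the function name and the returned dictionary keys are hypothetical. The decomposition is ambiguous whenever a variable residue is itself alanine or glycine, and the greedy regular expression below returns only one of the possible readings.

```python
import re

# X1 (Y1)c1 X2 (Y2)c2 X3 (Y3)c3 X4, with spacers Y made of A and/or G.
ANCHOR_PATTERN = re.compile(r"(.)([AG]*)(.)([AG]*)(.)([AG]*)(.)")


def parse_anchor_peptide(peptide):
    """Split a peptide into variable residues and spacers, or return None.

    Hypothetical helper: returns None when the peptide cannot be read as
    four variable residues separated by alanine/glycine spacers.
    """
    match = ANCHOR_PATTERN.fullmatch(peptide)
    if match is None:
        return None
    x1, y1, x2, y2, x3, y3, x4 = match.groups()
    return {
        "variable": [x1, x2, x3, x4],
        "spacers": [y1, y2, y3],
        "spacer_lengths": [len(y1), len(y2), len(y3)],
    }


if __name__ == "__main__":
    print(parse_anchor_peptide("WAAARGGDAK"))
    print(parse_anchor_peptide("WRDKL"))
```

Comparing the `variable` lists of several recovered binders at matching spacer lengths is one way to look for residues that recur as contacts with the target.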



47. The method of claim 44 further comprising the step of synthesizing said identified 
displayed peptide sequence.

48. The method of claim 44 wherein said target is selected from the group consisting of
proteinaceous molecules and non-proteinaceous molecules. 

49. The method of claim 44 wherein said target is selected from the group consisting of
ligands, receptors, hormones, cytokines, antibodies, antigens, enzymes, enzyme substrates and
viruses. 

50. The method of claim 44 wherein said peptide sequence further comprises at least one B, 
said B being a core binding sequence of p amino acid residues in length. 

51. The method of claim 44 wherein said peptide sequence further comprises at least one
cysteine residue. 

52. The method of claim 44 wherein at least one of said X^1, X^2, X^3 and X^4 residues is a
cysteine residue.

53. A peptide identified by use of said library of claim 1 which peptide is useful as a 
diagnostic or therapeutic product in that said peptide is able to bind to a target molecule which is 
involved in a physiological process. 

54. In a library having a collection of nucleic acid molecules encoding peptides having 
random amino acids, the improvement comprising a library in which the random amino acids are 
not continuous so that the amino acids in a peptide that are important contacts for interaction 
between said peptide and a target molecule can be identified. 

55. In a library having a collection of nucleic acid molecules encoding peptides having 
random amino acids, the improvement comprising nucleic acid molecules encoding alanine or 
glycine or a combination of alanine and glycine residues in varying numbers acting as spacers
between the random amino acids so that amino acid residues in a peptide that are important
contacts for interaction between said peptide and a target molecule can be identified.



56. A collection of recombinant DNA molecules encoding peptide sequences having a 
plurality of different binding domains, said peptide sequences comprising:

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the
same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a
combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and
any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of
c^1, c^2 and c^3 is 0 to about 20, wherein X^1 and X^4 are each attached to an amino acid residue that
flanks said peptide sequence, and

wherein at least about 10^ to about 10* permutations of all possible permutations of said 
peptide sequence are present in said collection. 

57. The collection of recombinant DNA molecules of claim 56 wherein said collection does
not contain more than about 10% of displayed peptide sequences different from said first 
mentioned displayed peptide sequences. 

58. The collection of recombinant DNA molecules of claim 56 wherein said peptide
sequences are displayed on the surface of a biological material selected from the group consisting
of a virus, phage, cell, spore and gene product. 

59. A recombinant filamentous phage having a displayed peptide sequence with known 
binding properties, 

said displayed peptide sequence being foreign to said filamentous phage,

said displayed peptide sequence comprising:

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the
same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a
combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and
any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of
c^1, c^2 and c^3 is 0 to about 20, wherein at least one of Y^1, Y^2 and Y^3 is at least about one amino
acid residue long, wherein X^1 and X^4 are each attached to an amino acid residue that flanks said
displayed peptide sequence, and

said displayed peptide sequence being able to bind to a target. 

60. A recombinant vector having a nucleic acid sequence inserted in a gene, said nucleic acid
sequence encoding a displayed peptide sequence having known binding properties,

said displayed peptide sequence comprising:

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the
same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a
combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and
any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of
c^1, c^2 and c^3 is 0 to about 20, wherein at least one of Y^1, Y^2 and Y^3 is at least about one amino
acid residue long, wherein X^1 and X^4 are each attached to an amino acid residue that flanks said
displayed peptide sequence, and

said displayed peptide sequence being able to bind to a target.

61. A recombinant nucleic acid molecule having a nucleic acid sequence inserted in a gene,
said nucleic acid sequence encoding a displayed peptide sequence having known binding
properties,

said displayed peptide sequence comprising:

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the
same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a
combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and
any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of
c^1, c^2 and c^3 is 0 to about 20, wherein at least one of Y^1, Y^2 and Y^3 is at least about one amino
acid residue long, wherein X^1 and X^4 are each attached to an amino acid residue that flanks said
displayed peptide sequence, and

said displayed peptide sequence being able to bind to a target.

62. A recombinant protein having a displayed peptide sequence having known binding
properties, 

said displayed peptide sequence comprising:

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the
same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a
combination of alanine and glycine that is respectively c^1, c^2 and c^3 amino acid residues long and
any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of
c^1, c^2 and c^3 is 0 to about 20, wherein at least one of Y^1, Y^2 and Y^3 is at least about one amino
acid residue long, wherein X^1 and X^4 are each attached to an amino acid residue that flanks said
displayed peptide sequence, and

said displayed peptide sequence being able to bind to a target. 

63. An anchor library, comprising:

a collection of recombinant vectors, 

each of said recombinant vectors having a nucleic acid sequence inserted in a 
gene, said nucleic acid sequence encoding a displayed peptide sequence, 

said displayed peptide sequence of each of said vectors comprising 

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the
same or different from any one other, wherein each Y^1, Y^2 and Y^3 is any specified amino acid or
combination of specified amino acids that is respectively c^1, c^2 and c^3 amino acid residues long
and any of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each
of c^1, c^2 and c^3 is 0 to about 20, wherein X^1 and X^4 are each attached to an amino acid residue that
flanks said displayed peptide sequence, and

wherein at least about 10^ to about 10^ permutations of all possible permutations of said
displayed peptide sequence are present in said anchor library. 



64. The library of claim 63 wherein said library does not contain more than about 10% of 
displayed peptide sequences different from said first mentioned displayed peptide sequences.

65. The library of claim 63 wherein each Y^1, Y^2 and Y^3 is alanine or cysteine or a
combination of alanine and cysteine. 

66. The library of claim 63 wherein each Y^1, Y^2 and Y^3 is glycine or cysteine or a
combination of glycine and cysteine. 

67. An anchor library, comprising: 

a collection of recombinant vectors, 

each of said recombinant vectors having a nucleic acid sequence inserted in a 
gene, said nucleic acid sequence encoding a displayed peptide sequence, 

said displayed peptide sequence of each of said vectors comprising 

X^1(Y^1)_{c^1}X^2(Y^2)_{c^2}X^3(Y^3)_{c^3}X^4

wherein each X^1, X^2, X^3 and X^4 is an amino acid residue and any of X^1, X^2, X^3 and X^4 can be the
same or different from any one other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a core
binding sequence B of p amino acid residues in length or a combination of alanine and glycine or
alanine and B or glycine and B, that is respectively c^1, c^2 and c^3 amino acid residues long and any
of Y^1, Y^2 and Y^3 if present can be the same or different from any one other, wherein each of c^1, c^2
and c^3 is 0 to about 20, wherein X^1 and X^4 are each attached to an amino acid residue that flanks
said displayed peptide sequence, and

wherein at least about 10^ to about 10* permutations of all possible permutations of said 
displayed peptide sequence are present in said anchor library. 

68. The library of claim 67 wherein said library does not contain more than about 10% of 
displayed peptide sequences different from said first mentioned displayed peptide sequences.

69. An anchor library, comprising:

a collection of recombinant vectors, 

each of said recombinant vectors having a nucleic acid sequence inserted in a 
gene, said nucleic acid sequence encoding a displayed peptide sequence, 
said displayed peptide sequence of each of said vectors comprising

Z^1(Y^1)_{c^1}Z^2(Y^2)_{c^2}Z^3(Y^3)_{c^3}Z^4

wherein each Z^1, Z^2, Z^3 and Z^4 is an amino acid residue or a core binding sequence B of p amino
acid residues in length and any of Z^1, Z^2, Z^3 and Z^4 can be the same or different from any one
other, wherein each Y^1, Y^2 and Y^3 is alanine or glycine or a combination of alanine and glycine
that is respectively c^1, c^2 and c^3 amino acid residues long and any of Y^1, Y^2 and Y^3 if present can
be the same or different from any one other, wherein each of c^1, c^2 and c^3 is 0 to about 20,
wherein Z^1 and Z^4 are each attached to an amino acid residue that flanks said displayed peptide
sequence, and

wherein at least about 10^ to about 10^ permutations of all possible permutations of said 
displayed peptide sequence are present in said anchor library. 

70. The library of claim 69 wherein said library does not contain more than about 10% of 
displayed peptide sequences different from said first mentioned displayed peptide sequences.

The inventions listed as Groups I-V do not relate to a single inventive concept under PCT Rule 13.1 because, under
PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons: The only